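Measuring the customer experience of mobile data is of utmost importance to mobile operators worldwide. Reference signal received power (RSRP) is one of the key indicators in current mobile network management, evaluation, and monitoring. Radio data collected through minimization of drive tests (MDT), a 3GPP standard technique, is commonly used for radio network analysis. Collecting MDT data across different geographical areas is inefficient and is constrained by terrain conditions and by the presence of users, so it is not an adequate technique for dynamic radio environments. In this paper, we study RSRP prediction using generative models for MDT data and a digital twin (DT), and propose a data-driven two-tier neural network (NN) model. In the first tier, environmental information related to user equipment (UE), base stations (BS), and network key performance indicators (KPIs) is extracted by a variational autoencoder (VAE). The second tier is designed as a likelihood model. Here, the environmental features and the real MDT data features are used together to formulate an integrated training process. In validation on real-world data, the proposed model improves accuracy by about 20% or more relative to an empirical model, and is also compared against a fully connected prediction network.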
translated by 谷歌翻译
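Next-generation networks will actively adopt artificial intelligence (AI) and machine learning (ML) technologies for network automation and optimal network operation strategies. The emerging network structure represented by Open RAN (O-RAN) follows this trend, and the RAN Intelligent Controller (RIC) at the centre of its specification serves as a host for ML applications. Various ML models, especially reinforcement learning (RL) models, are regarded as the key to solving RAN-related multi-objective optimization problems. However, it should be recognized that most current RL successes are confined to abstract and simplified simulation environments, which may not translate directly into high performance in complex real environments. One of the main reasons is the modelling gap between the simulation and the real environment, which can leave an RL agent trained in simulation ill-suited to the real environment. This issue is known as the sim2real gap. This article raises the sim2real challenge in the context of O-RAN. Specifically, it highlights the characteristics and benefits of the digital twin (DT) as a place for model development and verification. Several use cases are presented to illustrate and demonstrate the failure modes of simulation-trained RL models in real environments. The effectiveness of the DT in assisting the development of RL algorithms is discussed. State-of-the-art learning-based methods commonly used to overcome the sim2real challenge are then presented. Finally, the development and deployment concerns for realizing RL applications in O-RAN are discussed from the perspective of potential issues such as data interaction, environment bottlenecks, and algorithm design.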
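Radio access network (RAN) technologies continue to witness massive growth, with Open RAN (O-RAN) gaining the most recent momentum. In the O-RAN specifications, the RAN Intelligent Controller (RIC) serves as an automation host. This article introduces the principles of machine learning (ML), and in particular reinforcement learning (RL), that are relevant to the O-RAN stack. Furthermore, we review state-of-the-art research in wireless networks and map it onto the RAN framework and the hierarchy of the O-RAN architecture. We provide a taxonomy of the challenges faced by ML/RL models throughout the development life cycle, from system specification to production deployment (data acquisition, model design, testing and management, etc.). To address these challenges, we integrate a set of existing MLOps principles with the unique characteristics that arise when RL agents are considered. This paper discusses a systematic life-cycle pipeline for model development, testing, and validation, termed RLOps. We discuss all fundamental parts of RLOps, including model specification, development and distillation, production environment serving, operations monitoring, safety/security, and the data engineering platform. Based on these principles, we propose best practices for RLOps to achieve an automated and reproducible model development process.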
Due to the environmental impacts caused by the construction industry, repurposing existing buildings and making them more energy-efficient has become a high-priority issue. However, a legitimate concern of land developers is the buildings' state of conservation. For that reason, infrared thermography has been used as a powerful tool to characterize these buildings' state of conservation by detecting pathologies, such as cracks and humidity. Thermal cameras detect the radiation emitted by any material and translate it into temperature-color-coded images. Abnormal temperature changes may indicate the presence of pathologies; however, reading thermal images is not always straightforward. This research project aims to combine infrared thermography and machine learning (ML) to help stakeholders determine the viability of reusing existing buildings by identifying their pathologies and defects more efficiently and accurately. In this phase of the project, we used a deep convolutional neural network (DCNN) image classification model to differentiate three levels of cracks in one particular building. The model's accuracy was compared between the MSX and thermal images acquired from two distinct thermal cameras and fused images (formed through multisource information) to test the influence of the input data and network on the detection results.
Temporal action segmentation assigns an action label to every frame of an untrimmed input video that contains multiple actions in sequence. For this task, we propose an encoder-decoder-style architecture named C2F-TCN featuring a "coarse-to-fine" ensemble of decoder outputs. The C2F-TCN framework is enhanced with a novel model-agnostic temporal feature augmentation strategy based on computationally inexpensive stochastic max-pooling of segments. It produces more accurate and well-calibrated supervised results on three benchmark action segmentation datasets. We show that the architecture is flexible for both supervised and representation learning. In line with this, we present a novel unsupervised way to learn frame-wise representations from C2F-TCN. Our unsupervised learning approach hinges on the clustering capabilities of the input features and the formation of multi-resolution features from the decoder's implicit structure. Further, we provide the first semi-supervised temporal action segmentation results by merging representation learning with conventional supervised learning. Our semi-supervised learning scheme, called "Iterative-Contrastive-Classify (ICC)", progressively improves in performance with more labeled data. The ICC semi-supervised learning in C2F-TCN, with 40% labeled videos, performs similarly to fully supervised counterparts.
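The augmentation mentioned in the abstract above, stochastic max-pooling of segments, can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so everything specific here is an assumption made for illustration: the function name `stochastic_segment_max_pool`, the `window` parameter, and the rule of drawing each segment length uniformly from 1 to `2 * window - 1`.

```python
import random


def stochastic_segment_max_pool(features, window, rng=None):
    """Down-sample frame features by max-pooling over random-length segments.

    features: list of per-frame feature vectors (lists of floats, equal length).
    window:   nominal segment length; each actual length is drawn uniformly
              from 1 .. 2*window-1 (an assumption of this sketch, not a
              detail taken from the abstract).
    """
    rng = rng or random.Random()
    pooled = []
    start = 0
    while start < len(features):
        length = rng.randint(1, 2 * window - 1)
        segment = features[start:start + length]
        # Element-wise maximum over all frames in the segment.
        pooled.append([max(channel) for channel in zip(*segment)])
        start += length
    return pooled


if __name__ == "__main__":
    frames = [[float(i), float(-i)] for i in range(10)]
    print(stochastic_segment_max_pool(frames, window=3, rng=random.Random(0)))
```

Because the segment boundaries are redrawn on every call, the same video yields a slightly different down-sampled sequence each time, which is what makes this usable as an inexpensive augmentation.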
We propose Panoptic Lifting, a novel approach for learning panoptic 3D volumetric representations from images of in-the-wild scenes. Once trained, our model can render color images together with 3D-consistent panoptic segmentation from novel viewpoints. Unlike existing approaches which use 3D input directly or indirectly, our method requires only machine-generated 2D panoptic segmentation masks inferred from a pre-trained network. Our core contribution is a panoptic lifting scheme based on a neural field representation that generates a unified, multi-view-consistent 3D panoptic representation of the scene. To account for inconsistencies of 2D instance identifiers across views, we solve a linear assignment with a cost based on the model's current predictions and the machine-generated segmentation masks, thus enabling us to lift 2D instances to 3D in a consistent way. We further propose and ablate contributions that make our method more robust to noisy, machine-generated labels, including test-time augmentations for confidence estimates, segment consistency loss, bounded segmentation fields, and gradient stopping. Experimental results validate our approach on the challenging Hypersim, Replica, and ScanNet datasets, improving by 8.4%, 13.8%, and 10.6% in scene-level PQ over the state of the art.
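The linear assignment step described in the abstract above can be sketched as follows. This is an illustration only, not the authors' implementation: the cost (negative summed predicted probability of a 3D instance slot over the pixels of a 2D mask), the function name `match_instances`, and the flat per-pixel data layout are all assumptions. The sketch brute-forces the assignment with the standard library and assumes there are at least as many slots as masks; a practical version would use a Hungarian-style solver such as `scipy.optimize.linear_sum_assignment`.

```python
from itertools import permutations


def match_instances(pred_probs, mask_ids, num_masks):
    """Assign each 2D instance mask of one view to a distinct 3D instance slot.

    pred_probs[p][k]: model's current probability that pixel p belongs to slot k.
    mask_ids[p]:      id (0 .. num_masks-1) of the machine-generated 2D
                      instance covering pixel p.
    Returns a dict mapping mask id -> slot index.
    Assumes num_masks <= number of slots (hypothetical setup for this sketch).
    """
    num_slots = len(pred_probs[0])
    # cost[m][k] = -(sum of slot-k probabilities over the pixels of mask m)
    cost = [[0.0] * num_slots for _ in range(num_masks)]
    for probs, m in zip(pred_probs, mask_ids):
        for k, p in enumerate(probs):
            cost[m][k] -= p
    # Brute-force search over injective assignments; fine for a handful of
    # instances, exponential in general.
    best = min(
        permutations(range(num_slots), num_masks),
        key=lambda perm: sum(cost[m][perm[m]] for m in range(num_masks)),
    )
    return dict(enumerate(best))


if __name__ == "__main__":
    probs = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.0], [0.1, 0.1, 0.8], [0.2, 0.0, 0.8]]
    print(match_instances(probs, mask_ids=[1, 1, 0, 0], num_masks=2))
```

Solving this per view is what lets arbitrary, view-specific 2D instance ids be relabelled to a single set of 3D slots before a segmentation loss is applied.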
Terabytes of data are collected every day by wind turbine manufacturers from their fleets. The data contain valuable real-time information for turbine health diagnostics and performance monitoring, for predicting rare failures and the remaining service life of critical parts. And yet, this wealth of data from wind turbine fleets remains inaccessible to operators, utility companies, and researchers as manufacturing companies prefer the privacy of their fleets' turbine data for business strategic reasons. The lack of data access impedes the exploitation of opportunities, such as improving data-driven turbine operation and maintenance strategies and reducing downtimes. We present a distributed federated machine learning approach that leaves the data on the wind turbines to preserve the data privacy, as desired by manufacturers, while still enabling fleet-wide learning on those local data. We demonstrate in a case study that wind turbines which are scarce in representative training data benefit from more accurate fault detection models with federated learning, while no turbine experiences a loss in model performance by participating in the federated learning process. When comparing conventional and federated training processes, the average model training time rises significantly by a factor of 7 in the federated training due to increased communication and overhead operations. Thus, model training times might constitute an impediment that needs to be further explored and alleviated in federated learning applications, especially for large wind turbine fleets.
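To make the idea of fleet-wide learning without moving the data concrete, here is a minimal sketch of federated averaging (FedAvg), a standard aggregation scheme. The abstract above does not state which aggregation rule or model the authors use, so FedAvg, the one-dimensional linear model, and the names `local_update` and `federated_round` are assumptions chosen for illustration.

```python
def local_update(weights, data, lr=0.01, epochs=5):
    """Run a few gradient-descent epochs on one client's private data.

    weights: (w, b) of a toy model y = w * x + b.
    data:    list of (x, y) pairs that never leave the client.
    """
    w, b = weights
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = w * x + b - y
            grad_w += 2 * err * x
            grad_b += 2 * err
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
    return (w, b)


def federated_round(global_weights, client_datasets):
    """One FedAvg round: clients train locally, the server averages weights.

    Only model parameters are exchanged; the average is weighted by the
    number of local samples on each client.
    """
    total = sum(len(d) for d in client_datasets)
    updates = [(local_update(global_weights, d), len(d)) for d in client_datasets]
    w = sum(u[0] * n for u, n in updates) / total
    b = sum(u[1] * n for u, n in updates) / total
    return (w, b)


if __name__ == "__main__":
    # Two hypothetical clients whose data follow y = 2x + 1.
    clients = [[(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)], [(3.0, 7.0), (4.0, 9.0)]]
    weights = (0.0, 0.0)
    for _ in range(500):
        weights = federated_round(weights, clients)
    print(weights)
```

Each round requires every client to train and then communicate its parameters, which hints at why the abstract reports longer training times for the federated setup than for conventional training.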
Generating new fonts is time-consuming and labor-intensive, especially in a language with a very large number of characters, such as Chinese. Various deep learning models have demonstrated the ability to efficiently generate new fonts from a few reference characters of the target style. This project aims to develop a few-shot cross-lingual font generator based on AGIS-Net and improve the performance metrics mentioned. Our approaches include redesigning the encoder and the loss function. We will validate our method on multiple languages and datasets mentioned.
We present ObjectMatch, a semantic and object-centric camera pose estimator for RGB-D SLAM pipelines. Modern camera pose estimators rely on direct correspondences of overlapping regions between frames; however, they cannot align camera frames with little or no overlap. In this work, we propose to leverage indirect correspondences obtained via semantic object identification. For instance, when an object is seen from the front in one frame and from the back in another frame, we can provide additional pose constraints through canonical object correspondences. We first propose a neural network to predict such correspondences on a per-pixel level, which we then combine in our energy formulation with state-of-the-art keypoint matching, solved with a joint Gauss-Newton optimization. In a pairwise setting, our method improves registration recall of state-of-the-art feature matching from 77% to 87% overall and from 21% to 52% in pairs with 10% or less inter-frame overlap. In registering RGB-D sequences, our method outperforms cutting-edge SLAM baselines in challenging, low frame-rate scenarios, achieving more than 35% reduction in trajectory error in multiple scenes.
We propose ClipFace, a novel self-supervised approach for text-guided editing of textured 3D morphable models of faces. Specifically, we employ user-friendly language prompts to enable control of the expressions as well as the appearance of 3D faces. We leverage the geometric expressiveness of 3D morphable models, which inherently possess limited controllability and texture expressivity, and develop a self-supervised generative model to jointly synthesize expressive, textured, and articulated faces in 3D. We enable high-quality texture generation for 3D faces by adversarial self-supervised training, guided by differentiable rendering against collections of real RGB images. Controllable editing and manipulation are driven by language prompts that adapt the texture and expression of the 3D morphable model. To this end, we propose a neural network that predicts both texture and expression latent codes of the morphable model. Our model is trained in a self-supervised fashion by exploiting differentiable rendering and losses based on a pre-trained CLIP model. Once trained, our model jointly predicts face textures in UV-space, along with expression parameters to capture both geometry and texture changes in facial expressions in a single forward pass. We further show the applicability of our method to generate temporally changing textures for a given animation sequence.